Intro to lattice and irregular lattice
inla.doc("besag")
In this example we use the North Carolina Sudden Infant Death Syndrome dataset.
data(nc.sids)
head(nc.sids)
## CNTY.ID BIR74 SID74 NWBIR74 BIR79 SID79 NWBIR79 east north x
## Ashe 1825 1091 1 10 1364 0 19 164 176 -81.67
## Alleghany 1827 487 0 10 542 3 12 183 182 -50.06
## Surry 1828 3188 5 208 3616 6 260 204 174 -16.14
## Currituck 1831 508 1 123 830 2 145 461 182 406.01
## Northampton 1832 1421 9 1066 1606 3 1197 385 176 281.10
## Hertford 1833 1452 7 954 1838 5 1237 411 176 323.77
## y lon lat L.id M.id
## Ashe 4052.29 -81.48594 36.43940 1 2
## Alleghany 4059.70 -81.14061 36.52443 1 2
## Surry 4043.76 -80.75312 36.40033 1 2
## Currituck 4035.10 -76.04892 36.45655 1 4
## Northampton 4029.75 -77.44057 36.38799 1 4
## Hertford 4028.10 -76.96474 36.38189 1 4
hist(nc.sids$SID74)
summary(nc.sids)
## CNTY.ID BIR74 SID74 NWBIR74
## Min. :1825 Min. : 248 Min. : 0.00 Min. : 1.0
## 1st Qu.:1902 1st Qu.: 1077 1st Qu.: 2.00 1st Qu.: 190.0
## Median :1982 Median : 2180 Median : 4.00 Median : 697.5
## Mean :1986 Mean : 3300 Mean : 6.67 Mean :1051.0
## 3rd Qu.:2067 3rd Qu.: 3936 3rd Qu.: 8.25 3rd Qu.:1168.5
## Max. :2241 Max. :21588 Max. :44.00 Max. :8027.0
## BIR79 SID79 NWBIR79 east
## Min. : 319 Min. : 0.00 Min. : 3.0 Min. : 19.0
## 1st Qu.: 1336 1st Qu.: 2.00 1st Qu.: 250.5 1st Qu.:178.8
## Median : 2636 Median : 5.00 Median : 874.5 Median :285.0
## Mean : 4224 Mean : 8.36 Mean : 1352.8 Mean :271.3
## 3rd Qu.: 4889 3rd Qu.:10.25 3rd Qu.: 1406.8 3rd Qu.:361.2
## Max. :30757 Max. :57.00 Max. :11631.0 Max. :482.0
## north x y lon
## Min. : 6.0 Min. :-328.04 Min. :3757 Min. :-84.08
## 1st Qu.: 97.0 1st Qu.: -60.55 1st Qu.:3920 1st Qu.:-81.20
## Median :125.5 Median : 114.38 Median :3963 Median :-79.26
## Mean :122.1 Mean : 91.46 Mean :3953 Mean :-79.51
## 3rd Qu.:151.5 3rd Qu.: 240.03 3rd Qu.:4000 3rd Qu.:-77.87
## Max. :182.0 Max. : 439.65 Max. :4060 Max. :-75.67
## lat L.id M.id
## Min. :33.92 Min. :1.00 Min. :1.00
## 1st Qu.:35.26 1st Qu.:1.00 1st Qu.:2.00
## Median :35.68 Median :2.00 Median :3.00
## Mean :35.62 Mean :2.12 Mean :2.67
## 3rd Qu.:36.05 3rd Qu.:3.00 3rd Qu.:3.25
## Max. :36.52 Max. :4.00 Max. :4.00
nc.sids <- st_read(system.file("shapes/sids.shp", package="spData"))
## Reading layer `sids' from data source
## `/Library/Frameworks/R.framework/Versions/4.4-x86_64/Resources/library/spData/shapes/sids.shp'
## using driver `ESRI Shapefile'
## Simple feature collection with 100 features and 22 fields
## Geometry type: MULTIPOLYGON
## Dimension: XY
## Bounding box: xmin: -84.32385 ymin: 33.88199 xmax: -75.45698 ymax: 36.58965
## CRS: NA
We see that the number of deaths is a count variable and therefore propose a Poisson regression model, i.e. \[y\sim Poi(E\theta)\] with \[\theta = \exp(\eta)\] and \(E\) is the expected count. In this way, \(\theta\) indicates the risk. We need to calculate the expected count for each observation,
p_hat <- sum(nc.sids$SID74) / sum(nc.sids$BIR74)
nc.sids$E74 <- p_hat * nc.sids$BIR74
We use the NWBIR as a covariate to investigate if it has any influence on the risk of SIDS. We consider the proportion instead of the count,
nc.sids$nwp_hat74 <- nc.sids$NWBIR74 / nc.sids$BIR74
Adding a random intercept adds an independent effect, although the counties are probably not independent in terms of socio-economic diversity. So we should rather include a structured random effect such that some “intercepts” are correlated with each other, conveying the dependence in space. Since we have space divided into discrete non-overlapping parts, we can use a Besag or BYM model.
#Add besag for counties - a spatial model
#Graph - neighbourhood structure
nc.adj <- poly2nb(nc.sids)
B.nc <- nb2mat(nc.adj, style = "B")
plot(nc.sids$geometry, border="grey")
plot(nc.adj, coords = nc.sids$geometry, add = TRUE, col = "blue")
#Add covariate to spatial dataset
nc.sids$CNTY_ID2 <- 1:(length(nc.sids$CNTY_ID))
#Fit the spatial model
result3_sids <- inla(SID74 ~ nwp_hat74 + f(CNTY_ID2, model = "besagproper", graph = B.nc),
data = as.data.frame(nc.sids),
family = "poisson",
control.predictor = list(compute = TRUE),
E = E74)
The posterior summary is
summary(result3_sids)
## Time used:
## Pre = 4.59, Running = 1.08, Post = 0.101, Total = 5.77
## Fixed effects:
## mean sd 0.025quant 0.5quant 0.975quant mode kld
## (Intercept) -0.652 0.127 -0.910 -0.650 -0.405 -0.650 0
## nwp_hat74 1.883 0.283 1.332 1.881 2.449 1.881 0
##
## Random effects:
## Name Model
## CNTY_ID2 Proper version of Besags ICAR model
##
## Model hyperparameters:
## mean sd 0.025quant 0.5quant 0.975quant mode
## Precision for CNTY_ID2 7.46 6.90 1.502 5.47 25.79 3.248
## Diagonal for CNTY_ID2 1.08 1.16 0.097 0.73 4.16 0.268
##
## Marginal log-Likelihood: -227.11
## is computed
## Posterior summaries for the linear predictor and the fitted values are computed
## (Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')
The two extra hyperparameters from the besagproper effect are the marginal standard deviation, \(\sigma\) and the properness parameter, \(d\), and their marginal posteriors are
res3_sd_bym <- inla.tmarginal(function(x) sqrt(1/x), result3_sids$marginals.hyperpar$`Precision for CNTY_ID2`)
plot(res3_sd_bym, type = "l", xlab = expression(sigma), ylab = expression(paste("f(", sigma ,"|y)")))
plot(result3_sids$marginals.hyperpar$`Diagonal for CNTY_ID2`, type = "l", xlab = "d", ylab = "f(d|y)")
We can again visually see the fit of the predictions from the model to
the data,
nc.sids$fit_besag <- result3_sids$summary.fitted.values$mean*nc.sids$E74
plot(nc.sids["SID74"], breaks = seq(0,50, by = 5), main = "Observed counts")
plot(nc.sids["fit_besag"], breaks = seq(0,50, by = 5), main = "Predicted counts")
plot(nc.sids$SID74, ylab = "SIDS Counts")
points(nc.sids$E74*result3_sids$summary.fitted.values[,1], col = "purple", pch = 8, cex = 0.8)
When we compare the fit between models we can look at the marginal
log-likelihoods, DIC or WAIC.
result1_sids <- inla(SID74 ~ nwp_hat74, data = nc.sids,
family = "poisson",
control.compute = list(dic = TRUE, waic = TRUE),
control.predictor = list(compute = TRUE),
E = E74)
result2_sids <- inla(SID74 ~ nwp_hat74 + f(CNTY_ID2, model = "iid"),
data = nc.sids,
family = "poisson",
control.compute = list(dic = TRUE, waic = TRUE),
control.predictor = list(compute = TRUE),
E = E74)
result3_sids <- inla(SID74 ~ nwp_hat74 + f(CNTY_ID2, model = "besagproper", graph = B.nc),
data = as.data.frame(nc.sids),
family = "poisson",
control.compute = list(dic = TRUE, waic = TRUE),
control.predictor = list(compute = TRUE),
E = E74)
print(matrix(rbind(cbind(result1_sids$mlik[1], result1_sids$dic[1], result1_sids$waic[1]),
cbind(result2_sids$mlik[1], result2_sids$dic[1], result2_sids$waic[1]),
cbind(result3_sids$mlik[1], result3_sids$dic[1], result3_sids$waic[1])),
nrow = 3, ncol = 3,
dimnames = list(c("Model 1", "Model 2 - iid", "Model 3 - Besag"),
c("Marginal log-likelihood", "DIC", "WAIC")
)))
## Marginal log-likelihood DIC WAIC
## Model 1 -226.1261 441.6218 442.7613
## Model 2 - iid -227.215 431.0399 437.1806
## Model 3 - Besag -227.2152 431.8535 437.6031
Let’s visualise the fitted values of all the models.
nc.sids$fit_besag <- result3_sids$summary.fitted.values$mean*nc.sids$E74
nc.sids$fit_fixed <- result1_sids$summary.fitted.values$mean*nc.sids$E74
nc.sids$fit_iid <- result2_sids$summary.fitted.values$mean*nc.sids$E74
plot(nc.sids["SID74"], breaks = seq(0,50, by = 5), main = "Observed counts")
plot(nc.sids["fit_fixed"], breaks = seq(0,50, by = 5), main = "Predicted counts - fixed")
plot(nc.sids["fit_iid"], breaks = seq(0,50, by = 5), main = "Predicted counts - iid")
plot(nc.sids["fit_besag"], breaks = seq(0,50, by = 5), main = "Predicted counts - besag")
plot(nc.sids$SID74, ylab = "SIDS Counts")
points(nc.sids$E74*result1_sids$summary.fitted.values[,1], col = "blue", pch = 10)
points(nc.sids$E74*result2_sids$summary.fitted.values[,1], col = "red", pch = 19, cex = 0.5)
points(nc.sids$E74*result3_sids$summary.fitted.values[,1], col = "purple", pch = 8, cex = 0.8)
inla.doc("bym")
inla.doc("bym2")
Let’s revisit the NC SIDS dataset example.
Now we add a BYM effect for the different counties,
result4_sids <- inla(SID74 ~ nwp_hat74 + f(CNTY_ID2, model = "bym2", graph = B.nc),
data = as.data.frame(nc.sids),
family = "poisson",
control.compute = list(dic = TRUE, waic = TRUE),
control.predictor = list(compute = TRUE),
E = E74)
The posterior summary is
summary(result4_sids)
## Time used:
## Pre = 41.1, Running = 1.07, Post = 0.0951, Total = 42.2
## Fixed effects:
## mean sd 0.025quant 0.5quant 0.975quant mode kld
## (Intercept) -0.676 0.122 -0.922 -0.674 -0.441 -0.674 0
## nwp_hat74 1.955 0.317 1.347 1.949 2.596 1.949 0
##
## Random effects:
## Name Model
## CNTY_ID2 BYM2 model
##
## Model hyperparameters:
## mean sd 0.025quant 0.5quant 0.975quant mode
## Precision for CNTY_ID2 18.109 9.536 6.742 15.867 42.96 12.346
## Phi for CNTY_ID2 0.276 0.226 0.012 0.212 0.81 0.024
##
## Deviance Information Criterion (DIC) ...............: 427.64
## Deviance Information Criterion (DIC, saturated) ....: 122.21
## Effective number of parameters .....................: 25.54
##
## Watanabe-Akaike information criterion (WAIC) ...: 429.71
## Effective number of parameters .................: 22.31
##
## Marginal log-Likelihood: -170.41
## is computed
## Posterior summaries for the linear predictor and the fitted values are computed
## (Posterior marginals needs also 'control.compute=list(return.marginals.predictor=TRUE)')
The two hyperparameters from the BYM effect are the marginal standard deviation, \(\sigma\) and the weight parameter, \(\phi\), and their marginal posteriors are
res4_sd_bym <- inla.tmarginal(function(x) sqrt(1/x), result4_sids$marginals.hyperpar$`Precision for CNTY_ID2`)
plot(res4_sd_bym, type = "l", xlab = expression(sigma), ylab = expression(paste("f(", sigma ,"|y)")))
plot(result4_sids$marginals.hyperpar$`Phi for CNTY_ID2`, type = "l", xlab = expression(phi), ylab = expression(paste("f(", phi ,"|y)")))
We can again visually see the fit of the predictions from the model to
the data,
nc.sids$fit_bym <- result4_sids$summary.fitted.values$mean*nc.sids$E74
plot(nc.sids[c("SID74", "fit_fixed", "fit_iid", "fit_besag", "fit_bym")],
breaks = seq(0, 50, by = 5),
main = "Observed and predicted counts")
plot(nc.sids$SID74, ylab = "SIDS Counts")
points(nc.sids$E74*result1_sids$summary.fitted.values[,1], col = "blue", pch = 10)
points(nc.sids$E74*result2_sids$summary.fitted.values[,1], col = "red", pch = 19, cex = 0.5)
points(nc.sids$E74*result3_sids$summary.fitted.values[,1], col = "purple", pch = 8, cex = 0.8)
points(nc.sids$E74*result4_sids$summary.fitted.values[,1], col = "darkgreen", pch = 2, cex = 0.8)
We can visualise the spatial and specific effects as well.
nc.sids$bym <- result4_sids$summary.random$CNTY_ID2[1:max(nc.sids$CNTY_ID2),"mean"]
plot(nc.sids[c("bym")],
breaks = seq(-0.4,0.4, by = 0.05),
main = "BYM component")
nc.sids$bym_besag <- result4_sids$summary.random$CNTY_ID2[1:max(nc.sids$CNTY_ID2)+max(nc.sids$CNTY_ID2),"mean"]
plot(nc.sids[c("bym_besag")],
breaks = seq(-1,1, by = 0.1),
main = "Besag component")
nc.sids$bym_iid <- (result4_sids$summary.random$CNTY_ID2[1:max(nc.sids$CNTY_ID2),"mean"]*
sqrt(result4_sids$summary.hyperpar["Precision for CNTY_ID2", "mean"]) -
result4_sids$summary.random$CNTY_ID2[1:max(nc.sids$CNTY_ID2)+max(nc.sids$CNTY_ID2),"mean"]*
sqrt(result4_sids$summary.hyperpar["Phi for CNTY_ID2", "mean"]))/
(1-sqrt(result4_sids$summary.hyperpar["Phi for CNTY_ID2", "mean"]))
plot(nc.sids[c("bym_iid")],
breaks = seq(-3,3, by = 0.1),
main = "IID component")
When we compare the fit between models we can look at the marginal log-likelihoods, DIC or WAIC.
print(matrix(rbind(cbind(result1_sids$mlik[1], result1_sids$dic[1], result1_sids$waic[1]),
cbind(result2_sids$mlik[1], result2_sids$dic[1], result2_sids$waic[1]),
cbind(result3_sids$mlik[1], result3_sids$dic[1], result3_sids$waic[1]),
cbind(result4_sids$mlik[1], result4_sids$dic[1], result4_sids$waic[1])),
nrow = 4, ncol = 3,
dimnames = list(c("Model 1", "Model 2 - iid", "Model 3 - Besag", "Model 4 - BYM"),
c("Marginal log-likelihood", "DIC", "WAIC")
)))
## Marginal log-likelihood DIC WAIC
## Model 1 -226.1261 441.6218 442.7613
## Model 2 - iid -227.215 431.0399 437.1806
## Model 3 - Besag -227.2152 431.8535 437.6031
## Model 4 - BYM -170.6617 427.6361 429.7086
We can also do cross-validation as a means of measuring prediction performance (see https://arxiv.org/pdf/2210.04482).
result1_sids <- inla(SID74 ~ nwp_hat74, data = nc.sids,
family = "poisson",
control.compute = list(dic = TRUE, waic = TRUE,
cpo = TRUE, return.marginals = FALSE, control.gcpo = list(enable = TRUE,num.level.sets = 3)),
control.predictor = list(compute = TRUE),
E = E74)
result2_sids <- inla(SID74 ~ nwp_hat74 + f(CNTY_ID2, model = "iid"),
data = nc.sids,
family = "poisson",
control.compute = list(dic = TRUE, waic = TRUE,
cpo = TRUE, return.marginals = FALSE, control.gcpo = list(enable = TRUE,num.level.sets = 3)),
control.predictor = list(compute = TRUE),
E = E74)
result3_sids <- inla(SID74 ~ nwp_hat74 + f(CNTY_ID2, model = "besagproper", graph = B.nc),
data = as.data.frame(nc.sids),
family = "poisson",
control.compute = list(dic = TRUE, waic = TRUE,
cpo = TRUE, return.marginals = FALSE, control.gcpo = list(enable = TRUE,num.level.sets = 3)),
control.predictor = list(compute = TRUE),
E = E74)
result4_sids <- inla(SID74 ~ nwp_hat74 + f(CNTY_ID2, model = "bym2", graph = B.nc),
data = as.data.frame(nc.sids),
family = "poisson",
control.compute = list(dic = TRUE, waic = TRUE,
po = TRUE, return.marginals = FALSE, control.gcpo = list(enable = TRUE,num.level.sets = 3)),
control.predictor = list(compute = TRUE),
E = E74)
result1_sids$gcpo$groups[[1]]
## $idx
## [1] 1 38 78
##
## $corr
## [1] 1.0000000 0.9999960 0.9999973
result2_sids$gcpo$groups[[1]]
## $idx
## [1] 1 22 32 38 78 90
##
## $corr
## [1] 1.0000000 0.1693715 0.1693715 0.1663808 0.1663808 0.1693715
result3_sids$gcpo$groups[[1]]
## $idx
## [1] 1 2 18
##
## $corr
## [1] 1.0000000 0.3897119 0.3726388
result4_sids$gcpo$groups[[1]]
## $idx
## [1] 1 2 19
##
## $corr
## [1] 1.0000000 0.2199047 0.2022050
The results for the model selection and validation are:
print(matrix(rbind(cbind(result1_sids$mlik[1], result1_sids$dic[1], result1_sids$waic[1], mean(result1_sids$gcpo$gcpo)),
cbind(result2_sids$mlik[1], result2_sids$dic[1], result2_sids$waic[1], mean(result2_sids$gcpo$gcpo)),
cbind(result3_sids$mlik[1], result3_sids$dic[1], result3_sids$waic[1], mean(result3_sids$gcpo$gcpo)),
cbind(result4_sids$mlik[1], result4_sids$dic[1], result4_sids$waic[1], mean(result4_sids$gcpo$gcpo))),
nrow = 4, ncol = 4,
dimnames = list(c("Model 1", "Model 2 - iid", "Model 3 - Besag", "Model 4 - BYM"),
c("Marginal log-likelihood", "DIC", "WAIC", "G-CV score")
)))
## Marginal log-likelihood DIC WAIC G-CV score
## Model 1 -226.1261 441.6218 442.7613 0.1829267
## Model 2 - iid -227.215 431.0399 437.1806 0.1769652
## Model 3 - Besag -227.2152 431.8534 437.6031 0.1783804
## Model 4 - BYM -170.6617 427.6361 429.7086 0.1756431
We can do multivariate disease mapping when we take into account the effect of the presence or absence of other diseases simultaneously. These are treated as endogenous variables. \[\begin{eqnarray*} y_{i 1}|\lambda_{i1} &\sim& \text { Poisson }\left(E_{i 1} \lambda_{i 1}\right)\\ y_{i 2}|\lambda_{i2} &\sim& \text { Poisson }\left(E_{i 2} \lambda_{i 2}\right) \\ \log (\lambda_{i 1}) &=& m_{1} + \sum_{f = 1}^{F_1} \beta_f X_{i f} + \sum_{r=1}^{R_1} \rho^r(u_{i r}) + b_{i 1} + S_{i}\\ \log (\lambda_{i 2}) &=& m_{2} + \sum_{f = 1}^{F_2} \gamma_f Z_{i f} + \sum_{r=1}^{R_2} \xi^r(v_{i r})+ b_{i 2} + a \, S_{i}, \label{eq:jointmeanmodel} \end{eqnarray*}\]
# read data
merg.data <- readRDS("Data/MAP_data.RDS")
# The map
map <- ne_countries(returnclass = "sf")
# Extract the map and compile the shape
if (T){
names(map)[names(map) == "iso_a3"] <- "ISO3"
names(map)[names(map) == "name"] <- "NAME"
map <- map[order(map$name_long),]
map$name_long[grepl("Gambia", map$name_long)] <- "Gambia"
map$name_long[map$name_long == "Republic of the Congo"] <- "Congo"
map <- map[order(map$name_long),]
# Select only the common countries from the map
sub_map <- match(map$name_long , merg.data$Common.countries)
map$cc <- sub_map
map_2 <- subset(map,map$cc!="NA" )
map_2 <- map_2[order(map_2$name_long),]
map_2$D.E = merg.data$D.E
map_2$M.E = merg.data$M.E
map_2$M.cases = merg.data$M.cases
map_2$D.cases = merg.data$D.cases
map_2$M.pop = merg.data$M.pop
map_2$D.pop = merg.data$D.pop
library(cleangeo)
rr <- clgeo_CollectionReport(map_2)
summary(rr)
issues <- rr[rr$type == NA,]
map_2_c <- clgeo_Clean(map_2)
}
# Neighbourhood graph
nb <- poly2nb(map_2_c)
## although coordinates are longitude/latitude, st_intersects assumes that they
## are planar
nb2INLA("Day 2/map.adj",nb = nb)
g <- inla.read.graph(filename = "Day 2/map.adj")
map_2$ID = 1:length(map_2_c$level)
par(mar = c(0,0,0,0))
plot(map$geometry[map$continent == "Africa"], border = "grey")
plot(map_2_c, add = TRUE)
text(coordinates(map_2_c)[,1], coordinates(map_2_c)[,2], label = map_2$ID)
plot(nb, coords = map_2_c, add = TRUE, col = "blue", pch = 1, cex = 0.01)
#Edit the graph to manually connect 12 and 14 & 15 and 8
if (F){
ind = c(12,14,15,8)
g1 = g
g1$nnbs[ind]=g1$nnbs[ind] + 1
g1$nbs[[12]] = c(g1$nbs[[12]], 14)
g1$nbs[[14]] = c(g1$nbs[[14]], 12)
g1$nbs[[15]] = c(g1$nbs[[15]], 8)
g1$nbs[[8]] = c(g1$nbs[[8]], 15)
g1$cc$id = rep(1, g1$n)
g1$cc$n = 1
g1$cc$nodes = 1:g1$n
g1$cc$mean = rep(1, g1$n)
plot(g)
dev.off()
plot(g1)
}
#OR edit the neighbourhood
if (F){
nb1 <- nb
nb1[[12]] <- as.integer(c(14))
nb1[[14]] <- as.integer(c(nb[[14]], 12), length = length(nb[[14]])+1)
nb1[[8]] <- as.integer(c(nb[[8]], 15), length = length(nb[[8]])+1)
nb1[[15]] <- as.integer(c(nb[[15]], 8), length = length(nb[[15]])+1)
par(mar = c(0,0,0,0))
plot(map$geometry[map$continent == "Africa"], border = "grey")
plot(map_2_c, add = TRUE)
text(coordinates(map_2_c)[,1], coordinates(map_2_c)[,2], label = map_2_c$ID)
plot(nb1, coords = map_2_c, add = TRUE, col = "blue", pch = 1, cex = 0.01)
nb2INLA("Day 2/map1.adj",nb = nb1)
g1_alt <- inla.read.graph(filename = "Day 2/map1.adj")
plot(g1)
dev.off()
plot(g1_alt)
}
Now we can model Malaria and G6PD mean incidence jointly by sharing a common spatial field using besag2 (see https://doi.org/10.1098/rsos.230851).
b = length(merg.data$M.cases)
y1 = c(merg.data$M.cases , rep(NA,b))
y2 = c(rep(NA,b) , merg.data$D.cases)
E1 = c(merg.data$M.E,rep(NA,b))
E2 = c(rep(NA,b) , merg.data$D.E)
yy = cbind(y1,y2)
EE = cbind(E1, E2)
mu = c(rep(1, b) , rep(2,b))
b1 = c(1:b,rep(NA ,b))
b2 = c(rep(NA,b),1:b)
ID = c(1:b , rep(NA ,b))
ID_copy = c(rep(NA , b), 1:b + b)
m = as.factor(mu)
d = data.frame(yy, m, ID, ID_copy, b1, b2)
formula = yy ~ -1 + m + offset(log(E1))+ offset(log(E2))+
f(ID, model = "besag2", graph=g, scale.model = T)+
f(ID_copy, copy="ID", hyper = list(beta = list(fixed = TRUE)))+
f(b1 , model = "bym2", graph=g, scale.model=TRUE)+
f(b2, model = "bym2", graph=g, scale.model=TRUE)
r1 <- inla(formula,
family = c("poisson","poisson"),
data = d,
verbose = F,
control.predictor = list(compute = T))
round(r1$summary.fixed[,c(1:3,5)],3)
## mean sd 0.025quant 0.975quant
## m1 7.882 0.316 7.257 8.509
## m2 4.155 0.207 3.741 4.559
round(r1$summary.hyperpar[,c(1,3,5,6)],3)
## mean 0.025quant 0.975quant mode
## Precision for ID 31641.961 3964.354 98862.855 11522.015
## Scale paramter a for ID 1.063 0.573 1.791 0.945
## Precision for b1 0.410 0.231 0.661 0.374
## Phi for b1 0.184 0.013 0.565 0.036
## Precision for b2 1.045 0.556 1.803 0.912
## Phi for b2 0.188 0.018 0.552 0.055
We can plot the relative risk from the model.
#Fitted values
map_2$fit_besag2_m <- r1$summary.fitted.values[1:b,"mean"]/map_2$M.E
map_2$fit_besag2_g <- r1$summary.fitted.values[1:b+b,"mean"]/map_2$D.E
l <- leaflet(map_2) %>% addTiles()
pal <- colorNumeric(palette = "YlOrRd",
domain = seq(min(map_2$fit_besag2_m)-0.5,max(map_2$fit_besag2_m)+0.5 , by = 0.1))
l %>% addPolygons(color = "grey", weight = 1,
fillColor = ~pal(map_2$fit_besag2_m),
fillOpacity = 0.5) %>%
addLegend(pal = pal, values= ~seq(min(map_2$fit_besag2_m)-0.5,max(map_2$fit_besag2_m)+0.5 , by = 0.1),
opacity = 0.5, title = "RR from besag2 model for malaria",
position = "topright")
pal <- colorNumeric(palette = "YlOrRd",
domain = seq(min(map_2$fit_besag2_g)-0.5,max(map_2$fit_besag2_g)+0.5 , by = 0.1))
l %>% addPolygons(color = "grey", weight = 1,
fillColor = ~pal(map_2$fit_besag2_g),
fillOpacity = 0.5) %>%
addLegend(pal = pal, values = ~seq(min(map_2$fit_besag2_g)-0.5,max(map_2$fit_besag2_g)+0.5 , by = 0.1),
opacity = 0.5, title = "RR from besag2 model for G6PD",
position = "topright")
#Common spatial field
map_2$besag_field <- r1$summary.random$ID$mean[1:b]*r1$summary.hyperpar["Scale paramter a for ID", "mean"]
pal <- colorNumeric(palette = "YlOrRd",
domain = seq(min(map_2$besag_field)-0.5,max(map_2$besag_field)+0.5 , by = 0.1))
l %>% addPolygons(color = "grey", weight = 1,
fillColor = ~pal(map_2$besag_field),
fillOpacity = 0.5) %>%
addLegend(pal = pal, values = ~seq(min(map_2$besag_field)-0.5,max(map_2$besag_field)+0.5 , by = 0.1),
opacity = 0.5, title = "Common spatial field",
position = "topright")
#Disease-specific BYM effects
map_2$bym2_m <- r1$summary.random$b1$mean[1:b]
map_2$bym2_g <- r1$summary.random$b2$mean[1:b]
pal <- colorNumeric(palette = "YlOrRd",
domain = seq(min(map_2$bym2_m)-0.5,max(map_2$bym2_m)+0.5 , by = 0.1))
l %>% addPolygons(color = "grey", weight = 1,
fillColor = ~pal(map_2$bym2_m),
fillOpacity = 0.5) %>%
addLegend(pal = pal, values = ~seq(min(map_2$bym2_m)-0.5,max(map_2$bym2_m)+0.5 , by = 0.1),
opacity = 0.5, title = "BYM values for malaria",
position = "topright")
pal <- colorNumeric(palette = "YlOrRd",
domain = seq(min(map_2$bym2_g)-0.5,max(map_2$bym2_g)+0.5 , by = 0.1))
l %>% addPolygons(color = "grey", weight = 1,
fillColor = ~pal(map_2$bym2_g),
fillOpacity = 0.5) %>%
addLegend(pal = pal, values = ~seq(min(map_2$bym2_g)-0.5,max(map_2$bym2_g)+0.5 , by = 0.1),
opacity = 0.5, title = "BYM values for G6PD",
position = "topright")
#Disease-specific Besag effects
map_2$bym2_m <- r1$summary.random$b1$mean[b+1:b]
map_2$bym2_g <- r1$summary.random$b2$mean[b+1:b]
pal <- colorNumeric(palette = "YlOrRd",
domain = seq(min(map_2$bym2_m)-0.5,max(map_2$bym2_m)+0.5 , by = 0.1))
l %>% addPolygons(color = "grey", weight = 1,
fillColor = ~pal(map_2$bym2_m),
fillOpacity = 0.5) %>%
addLegend(pal = pal, values = ~seq(min(map_2$bym2_m)-0.5,max(map_2$bym2_m)+0.5 , by = 0.1),
opacity = 0.5, title = "Besag values for malaria",
position = "topright")
pal <- colorNumeric(palette = "YlOrRd",
domain = seq(min(map_2$bym2_g)-0.5,max(map_2$bym2_g)+0.5 , by = 0.1))
l %>% addPolygons(color = "grey", weight = 1,
fillColor = ~pal(map_2$bym2_g),
fillOpacity = 0.5) %>%
addLegend(pal = pal, values = ~seq(min(map_2$bym2_g)-0.5,max(map_2$bym2_g)+0.5 , by = 0.1),
opacity = 0.5, title = "Besag values for G6PD",
position = "topright")
# the iid for malaria
map_2$vm = (r1$summary.random$b1$mean[1:21] * sqrt(r1$summary.hyperpar$mode[3]) -
r1$summary.random$b1$mean[22:42] * sqrt(r1$summary.hyperpar$mode[4])) /
sqrt(1 - r1$summary.hyperpar$mode[4])
range(map_2$vm)
pal <- colorNumeric(palette = "YlOrRd", domain = seq(min(map_2$vm)-0.2, max(map_2$vm) + 0.5 , by = 0.1))
l %>% addPolygons(color = "grey", weight = 1,
fillColor = ~pal(map_2$vm),
fillOpacity = 0.5) %>%
addLegend(pal = pal, values = ~seq(min(map_2$vm)-0.2, max(map_2$vm) + 0.5 , by = 0.1),
opacity = 0.5, title = "IID of Malaria",
position = "topright")
# iid effect for g6pd
map_2$v = (r1$summary.random$b2$mean[1:21] * sqrt(r1$summary.hyperpar$mode[5]) -
r1$summary.random$b2$mean[22:42] * sqrt(r1$summary.hyperpar$mode[6])) /
sqrt(1 - r1$summary.hyperpar$mode[6])
pal <- colorNumeric(palette = "YlOrRd", domain = seq(min(map_2$v)-0.2, max(map_2$v) + 0.5 , by = 0.1))
l %>% addPolygons(color = "grey", weight = 1,
fillColor = ~pal(map_2$v),
fillOpacity = 0.5) %>%
addLegend(pal = pal, values = ~seq(min(map_2$v)-0.2, max(map_2$v) + 0.5 , by = 0.1),
opacity = 0.5, title = "IID for G6PD",
position = "topright")
Now we can also do quantile modeling, instead of the mean risk we can model the high risk areas only and see if there are different covariate impacts for the higher levels of risk.
\[\begin{eqnarray}
y_{i 1}|\lambda_{i1} &\sim& \text { Poisson
}\left(E_{i1}\lambda_{i 1}\right) \notag\\
y_{i 2}|\lambda_{i2} &\sim& \text { Poisson
}\left(E_{i2}\lambda_{i 2}\right) \notag\\
\log (q_{i 1, \alpha_{1}}) &=& \eta_{i 1,\alpha_{1}} = m_{1} +
\sum_{f = 1}^{F_1} \beta_f X_{i f} + \sum_{r=1}^{R_1} \rho^r(u_{i r}) +
b_{i 1} + S_{i} \notag \\
\log (q_{i 2,\alpha_{2}}) &=& \eta_{i 2,\alpha_{2}} = m_{2} +
\sum_{f = 1}^{F_2} \gamma_f Z_{i f} + \sum_{r=1}^{R_2} \xi^r(v_{i r})+
b_{i 2} + c \, S_{i},\label{qa2}
\end{eqnarray}\]. In INLA we only need to add the special link
function as follows:
control.family = list(control.link = list(model = “quantile”, quantile =
alpha)).
First we model the malaria incidence quantile and thereafter the G6PD incidence, while ignoring a shared effect.
alpha0 = 0.8
r.m <- inla(formula = map_2$M.cases ~ 1 + offset(log(map_2$M.E)) +
f(b1, model = "bym2", scale.model = T,
graph = g) ,
family = "poisson",
data = merg.data,
control.predictor = list(compute = T),
control.family = list(control.link = list(model = "quantile",
quantile = alpha0)),
verbose = F)
round(r.m$summary.hyperpar[,c(1,3,5,6)],3)
## mean 0.025quant 0.975quant mode
## Precision for b1 2.045 1.072 3.480 1.818
## Phi for b1 0.338 0.076 0.695 0.255
# the quantile for G6PD only
r.d <- inla(formula = map_2$D.cases ~ 1 + offset(log(map_2$D.E))+
f(b2,model = "bym2", scale.model = T,
graph = g) ,
family = "poisson",
data = merg.data,
control.predictor = list(compute = T),
control.family = list(control.link = list(model = "quantile",
quantile = alpha0)),
verbose = F)
round(r.d$summary.hyperpar[,c(1,3,5,6)],3)
## mean 0.025quant 0.975quant mode
## Precision for b2 2.946 1.426 5.328 2.521
## Phi for b2 0.281 0.051 0.656 0.163
Visualize the effects.
# the iid EFFECT malaria
map_2$vm2 = (r.m$summary.random$b1$mean[1:21] * sqrt(r.m$summary.hyperpar$mode[1]) -
r.m$summary.random$b1$mean[22:42] * sqrt(r.m$summary.hyperpar$mode[2])) /
sqrt(1 - r.m$summary.hyperpar$mode[2])
pal <- colorNumeric(palette = "YlOrRd", domain = seq(-2.6,1.6 , by = 0.1))
l %>% addPolygons(color = "grey", weight = 1,
fillColor = ~pal(map_2$vm2),
fillOpacity = 0.5) %>%
addLegend(pal = pal, values = ~seq(-2.6,1.6 , by = 0.1),
opacity = 0.5, title = "IID effect for G6PD",
position = "topright")
# the iid EFFECT g6
map_2$vd2 = (r.d$summary.random$b2$mean[1:21] * sqrt(r.d$summary.hyperpar$mode[1]) -
r.d$summary.random$b2$mean[22:42] * sqrt(r.d$summary.hyperpar$mode[2])) /
sqrt(1 - r.d$summary.hyperpar$mode[2])
pal <- colorNumeric(palette = "YlOrRd", domain = seq(-5.6,2.6 , by = 0.1))
l %>% addPolygons(color = "grey", weight = 1,
fillColor = ~pal(map_2$vd2),
fillOpacity = 0.5) %>%
addLegend(pal = pal, values = ~seq(-5.6,2.6 , by = 0.1),
opacity = 0.5, title = "IID effect for G6PD",
position = "topright")
# the spatial effect for malaria
map_2$u.m2 = r.m$summary.random$b1$mean[22:42]
pal <- colorNumeric(palette = "YlOrRd", domain = seq(-5.6,2.6 , by = 0.1))
l %>% addPolygons(color = "grey", weight = 1,
fillColor = ~pal(map_2$u.m2),
fillOpacity = 0.5) %>%
addLegend(pal = pal, values = ~seq(-5.6,2.6 , by = 0.1),
opacity = 0.5, title = "Besag effect for Malaria",
position = "topright")
# the spatial effect for G6PD
map_2$u.d2 = r.d$summary.random$b2$mean[22:42]
pal <- colorNumeric(palette = "YlOrRd", domain =seq(-5.6,2.6 , by = 0.1))
l %>% addPolygons(color = "grey", weight = 1,
fillColor = ~pal(map_2$u.d2),
fillOpacity = 0.5) %>%
addLegend(pal = pal, values = ~ seq(-5.6,2.6 , by = 0.1),
opacity = 0.5, title = "Besag effect for G6PD",
position = "topright")
# The relative risk for malaria
map_2$rrm = r.m$summary.fitted.values[,"mean"]/map_2$M.E
range(map_2$rrm)
## [1] 0.2501084 4.0811419
pal <- colorNumeric(palette = "YlOrRd",
domain = seq(0,6.3 , by = 0.1))
l %>% addPolygons(color = "grey", weight = 1,
fillColor = ~pal(map_2$rrm),
fillOpacity = 0.5) %>%
addLegend(pal = pal, values = ~seq(0,6.3 , by = 0.1),
opacity = 0.5, title = "RR",
position = "topright")
# The relative risk for G6PD
map_2$rrd = r.d$summary.fitted.values[,"mean"]/map_2$D.E
range(map_2$rrd)
## [1] 0.3225694 5.0984641
pal <- colorNumeric(palette = "YlOrRd", domain =seq(0, 6.3, length = 9))
l %>% addPolygons(color = "grey", weight = 1,
fillColor = ~pal(map_2$rrd),
fillOpacity = 0.5) %>%
addLegend(pal = pal, values = ~ seq(0, 6.3, length = 9),
opacity = 0.5, title = "RR",
position = "topright")
# The predicted cases for Malaria
map_2$Predicted.MM = r.m$summary.fitted.values$mean
pal <- colorNumeric(palette = "YlOrRd", domain =seq(0,40000, length =9))
l %>% addPolygons(color = "grey", weight = 1,
fillColor = ~pal(map_2$Predicted.MM),
fillOpacity = 0.5) %>%
addLegend(pal = pal, values = ~ seq(0,40000, length =9),
opacity = 0.5, title = "Predicted",
position = "topright")
#Observed
l %>% addPolygons(color = "grey", weight = 1,
fillColor = ~pal(map_2$M.cases),
fillOpacity = 0.5) %>%
addLegend(pal = pal, values = ~ seq(0,40000, length =9),
opacity = 0.5, title = "Observed",
position = "topright")
# The predicted cases for G6PD
map_2$Predicted.DD = r.d$summary.fitted.values$mean
range(map_2$Predicted.DD)
## [1] 6.111768 423.076806
pal <- colorNumeric(palette = "YlOrRd", domain =seq(0,500 , by = 10))
l %>% addPolygons(color = "grey", weight = 1,
fillColor = ~pal(map_2$Predicted.DD),
fillOpacity = 0.5) %>%
addLegend(pal = pal, values = ~ seq(0,500 , by = 10),
opacity = 0.5, title = "Predicted",
position = "topright")
l %>% addPolygons(color = "grey", weight = 1,
fillColor = ~pal(map_2$D.cases),
fillOpacity = 0.5) %>%
addLegend(pal = pal, values = ~ seq(0,500 , by = 10),
opacity = 0.5, title = "Observed",
position = "topright")
Now we can do a multivariate quantile disease mapping of these two diseases.
#Use the dataset and formula we already created for the joint mean modeling
formula = yy ~ -1 + m + offset(log(E1))+ offset(log(E2))+
f(ID, model = "besag2", graph=g, scale.model = T)+
f(ID_copy, copy="ID", hyper = list(beta = list(fixed = TRUE)))+
f(b1 , model = "bym2", graph=g, scale.model=TRUE)+
f(b2, model = "bym2", graph=g, scale.model=TRUE)
alpha0 = 0.7
r1 <- inla(formula,
family = c("poisson","poisson"),
control.family = list(list(control.link = list(model = "quantile",quantile = alpha0)),
list(control.link = list(model = "quantile",
quantile = alpha0))),
data = d,
verbose = F,
control.compute = list(dic = T, waic = T, cpo = T),
control.predictor = list(compute = T))
round(r1$summary.fixed[,c(1:3,5)],3)
## mean sd 0.025quant 0.975quant
## m1 7.897 0.315 7.274 8.522
## m2 4.211 0.202 3.805 4.607
round(r1$summary.hyperpar[,c(1,3,5,6)],3)
## mean 0.025quant 0.975quant mode
## Precision for ID 31958.097 3969.825 103145.631 11379.477
## Scale paramter a for ID 1.068 0.578 1.829 0.933
## Precision for b1 0.420 0.236 0.671 0.387
## Phi for b1 0.184 0.014 0.593 0.039
## Precision for b2 1.091 0.579 1.868 0.959
## Phi for b2 0.188 0.019 0.557 0.056
#Opposite quantiles
r1a <- inla(formula,
family = c("poisson","poisson"),
control.family = list(list(control.link = list(model = "quantile",quantile = alpha0)),
list(control.link = list(model = "quantile",
quantile = 1-alpha0))),
data = d,
verbose = F,
control.compute = list(dic = T, waic = T, cpo = T),
control.predictor = list(compute = T))
round(r1a$summary.fixed[,c(1:3,5)],3)
## mean sd 0.025quant 0.975quant
## m1 7.899 0.192 7.443 8.349
## m2 4.045 0.208 3.628 4.451
round(r1a$summary.hyperpar[,c(1,3,5,6)],3)
## mean 0.025quant 0.975quant mode
## Precision for ID 0.640 0.214 1.510 0.440
## Scale paramter a for ID 1.629 1.150 2.241 1.560
## Precision for b1 6.380 0.189 36.103 0.442
## Phi for b1 0.227 0.009 0.702 0.018
## Precision for b2 1.079 0.546 1.902 0.936
## Phi for b2 0.160 0.011 0.562 0.029
Maybe the besag2 is too strict because of the restriction a>0. We can add more flexibility with copying a besagproper model component as follows:
#High quantile of malaria and low quantile of G6PD
alpha0 = 0.7
d1 <- d
d1$ID_copy = c(rep(NA, b),1:b)
formula1 = yy ~ -1 + m + offset(log(E1))+ offset(log(E2))+
f(ID, model = "besagproper", graph=g) +
f(ID_copy, copy="ID", hyper = list(beta = list(fixed = FALSE))) +
f(b1 , model = "bym2", graph=g, scale.model=TRUE)+
f(b2, model = "bym2", graph=g, scale.model=TRUE)
r2a <- inla(formula1,
family = c("poisson","poisson"),
control.family = list(list(control.link = list(model = "quantile",quantile = alpha0)),
list(control.link = list(model = "quantile",
quantile = 1-alpha0))),
data = d1,
verbose = F,
control.compute = list(dic = T, waic = T, cpo = T),
control.predictor = list(compute = T))
round(r2a$summary.fixed[,c(1:3,5)],3)
## mean sd 0.025quant 0.975quant
## m1 7.892 0.404 7.088 8.696
## m2 4.044 0.338 3.358 4.712
round(r2a$summary.hyperpar[,c(1,3,5,6)],3)
## mean 0.025quant 0.975quant mode
## Precision for ID 2.392 0.090 14.060 0.218
## Diagonal for ID 1.358 0.176 4.598 0.487
## Precision for b1 0.544 0.273 0.967 0.470
## Phi for b1 0.183 0.012 0.638 0.031
## Precision for b2 2.312 0.254 8.590 0.695
## Phi for b2 0.209 0.014 0.675 0.038
## Beta for ID_copy 1.077 0.436 1.709 1.086
#Median joint modeling
alpha0 = 0.5
formula1 = yy ~ -1 + m + offset(log(E1))+ offset(log(E2))+
f(ID, model = "besagproper", graph=g) +
f(ID_copy, copy="ID", hyper = list(beta = list(fixed = FALSE))) +
f(b1 , model = "bym2", graph=g, scale.model=TRUE)+
f(b2, model = "bym2", graph=g, scale.model=TRUE)
r3a <- inla(formula1,
family = c("poisson","poisson"),
control.family = list(list(control.link = list(model = "quantile",quantile = alpha0)),
list(control.link = list(model = "quantile",
quantile = 1-alpha0))),
data = d1,
verbose = F,
control.compute = list(dic = T, waic = T, cpo = T),
control.predictor = list(compute = T))
round(r3a$summary.fixed[,c(1:3,5)],3)
## mean sd 0.025quant 0.975quant
## m1 7.877 0.373 7.132 8.621
## m2 4.129 0.300 3.517 4.718
round(r3a$summary.hyperpar[,c(1,3,5,6)],3)
## mean 0.025quant 0.975quant mode
## Precision for ID 13.045 0.067 87.991 0.127
## Diagonal for ID 1.325 0.136 4.896 0.378
## Precision for b1 0.509 0.247 0.902 0.444
## Phi for b1 0.187 0.012 0.646 0.033
## Precision for b2 2.285 0.142 9.531 0.382
## Phi for b2 0.207 0.014 0.675 0.036
## Beta for ID_copy 1.096 0.473 1.732 1.085